ISO2024 INTRODUCTORY SPATIAL 'OMICS ANALYSIS
- HYBRID : TORONTO & ZOOM
- 10TH JULY 2024
**Module 4: Drawing the boundaries**
Instructor : Shamini Ayyadhury
TOPICS COVERED
- A. Classical segmentation
- B. Segmentation-free
A. CLASSICAL SEGMENTATION
In [ ]:
### import the following libraries
import sys
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import scanpy as sc
#from PIL import Image
import os
import warnings
import tifffile as tiff
warnings.filterwarnings('ignore')
sys.path.append('/home/shamini/data/projects/spatial_workshop/')
import pre_processing_fnc as ppf
In [ ]:
### directory & filepaths
data_dir = '/home/shamini/data1/data_orig/data/spatial/xenium/10xGenomics/cell_seg_brain_cancer/'
out = '/home/shamini/data/projects/spatial_workshop/out/module4/'
os.makedirs(out+'figures/', exist_ok=True)  ### `out` already ends in 'module4/'
We will load the following files:
- transcripts_subset
  - A smaller subset of a larger transcripts file from a human brain cancer sample.
- composite_image
  - The corresponding image file, which has been reduced and processed to show the four channel markers used for cell segmentation staining.
- cell_boundaries
  - This file contains the polygon information for the cell boundaries.

The processing steps that were used to derive the above files can be found in supplementary script 06.
In [ ]:
cell_boundaries = pd.read_csv(out+'cell_boundaries_subset.csv')
transcripts_subset_3g = pd.read_csv(out+'transcripts_subset_3g.csv')
iF_crop = tiff.imread(out+'cropped_image_fluo.tif')
#----------------------------------------------
ppf.get_memory_usage() ### monitor memory usage
Out[ ]:
'Memory usage: 781.73 MB'
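The boundary table is used later in this module with one row per polygon vertex and the columns `cell_id`, `vertex_x` and `vertex_y`. As a minimal sketch of how that long format can be worked with, the example below builds a small made-up table (the values are illustrative, not taken from the workshop data) and computes each polygon's area with the shoelace formula. The helper name `polygon_area` is hypothetical and is not part of `pre_processing_fnc`.

```python
import numpy as np
import pandas as pd

# Toy boundary table (made-up values): one row per polygon vertex,
# using the same column names as cell_boundaries in this module.
toy_boundaries = pd.DataFrame({
    'cell_id':  ['a', 'a', 'a', 'a', 'b', 'b', 'b'],
    'vertex_x': [0.0, 4.0, 4.0, 0.0, 10.0, 13.0, 10.0],
    'vertex_y': [0.0, 0.0, 3.0, 3.0, 0.0, 0.0, 4.0],
})

def polygon_area(vertices):
    """Shoelace formula for the area of a simple (non self-intersecting) polygon."""
    x = vertices['vertex_x'].to_numpy()
    y = vertices['vertex_y'].to_numpy()
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

# One area per cell, in the squared units of the vertex coordinates
areas = {cell_id: polygon_area(group)
         for cell_id, group in toy_boundaries.groupby('cell_id')}
print(areas)
```

On the real table, the same loop over `cell_boundaries.groupby('cell_id')` would give a quick check for implausibly small or large cells.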
In [ ]:
composite_img = ppf.plot_composite_image(iF_crop)
#----------------------------------------------
ppf.get_memory_usage() ### monitor memory usage
Shape of iF_crop: (4, 4705, 4705)
Channel 0 max: 9350, min: 6
Channel 1 max: 10806, min: 0
Channel 2 max: 10972, min: 2
Channel 3 max: 8295, min: 0
Out[ ]:
'Memory usage: 1035.36 MB'
STOP FOR DISCUSSION/LECTURE
In [ ]:
'''
FIGURE 1A - PLOT THE INDIVIDUAL CHANNELS & THE COMPOSITE IMAGE
'''
fig, ax = plt.subplots(1, 5, figsize=(40, 10))
fig.suptitle('A1. Individual Channels & Composite Image', fontsize=30, fontweight='bold', y=0.95, x=0.3)
for i in range(4):
    ax[i].imshow(iF_crop[i,:,:], cmap='gray')
    ax[i].axis('off')
ax[4].imshow(composite_img)
'''
FIGURE 1 - PLOT THE INDIVIDUAL CHANNELS & THE COMPOSITE IMAGE WITH ZOOMED IN VIEW
'''
xlower = 3500
ylower = 3500
xlim = [xlower, xlower+600]
ylim = [ylower, ylower+600]
fig, ax = plt.subplots(1, 5, figsize=(40, 10))
fig.suptitle('A2. Individual Channels & Composite Image (Zoomed In)', fontsize=30, fontweight='bold', y=0.95, x=0.35)
for i in range(4):
    ax[i].imshow(iF_crop[i,:,:], cmap='gray')
    ax[i].set_xlim(xlim)
    ax[i].set_ylim(ylim)
    ax[i].axis('off')
ax[4].imshow(composite_img)
ax[4].set_xlim(xlim)
ax[4].set_ylim(ylim)
#----------------------------------------------
ppf.get_memory_usage() ### monitor memory usage
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.0..1.6224868].
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.0..1.6224868].
Out[ ]:
'Memory usage: 1903.07 MB'
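The clipping warning above appears because the composite is a float RGB array whose values exceed 1. How `ppf.plot_composite_image` builds the composite is not shown here, so the following is only a sketch on a made-up array of two common ways to bring such an image into the [0, 1] range before calling `imshow`: explicit clipping, which is what `imshow` does implicitly, and rescaling by the maximum, which preserves relative intensities.

```python
import numpy as np

# Made-up float RGB image spanning the range reported in the warning
toy_composite = np.linspace(0.0, 1.6224868, 8 * 8 * 3).reshape(8, 8, 3)

# Option 1: clip explicitly (bright pixels saturate at 1)
clipped = np.clip(toy_composite, 0.0, 1.0)

# Option 2: rescale by the global maximum (relative intensities are kept,
# but the whole image becomes darker)
rescaled = toy_composite / toy_composite.max()

print(clipped.min(), clipped.max())
print(rescaled.min(), rescaled.max())
```

Either result can be passed to `imshow` without triggering the warning. Which option is preferable depends on whether saturated bright regions or a darker overall image is more acceptable for inspecting cell outlines.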
STOP FOR DISCUSSION/LECTURE
- Participants will now explore the images by altering the xlower and ylower parameters below
- Look carefully at the cell shapes and surrounding environments and appreciate the difficulty in solving the segmentation problem
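When trying many different windows, it can help to wrap the window arithmetic in small helpers. The sketch below uses hypothetical function names (`window_limits`, `crop_channels`) and a blank toy stack. It assumes the image is laid out as (channels, rows, columns), as reported for `iF_crop`, so that rows correspond to y and columns to x.

```python
import numpy as np

def window_limits(xlower, ylower, size=600):
    """Return [xmin, xmax] and [ymin, ymax] for a square window."""
    return [xlower, xlower + size], [ylower, ylower + size]

def crop_channels(image, xlower, ylower, size=600):
    """Slice a (channels, rows, cols) stack; rows are y and columns are x."""
    return image[:, ylower:ylower + size, xlower:xlower + size]

# Blank toy stack standing in for the 4-channel image
toy_stack = np.zeros((4, 1000, 1000), dtype=np.uint16)

xlim, ylim = window_limits(0, 200)
crop = crop_channels(toy_stack, 0, 200)
print(xlim, ylim, crop.shape)
```

Slicing before plotting means only the window is rendered rather than the full image. The cropped array starts at pixel (0, 0), so if the original coordinates are needed on the axes, `imshow` accepts an `extent` argument for that purpose.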
In [ ]:
### ---------- Participants to alter the following parameters ----------
xlower = 0
ylower = 2000
### ---------------------------------------------------------------------
xlim = [xlower, xlower+600]
ylim = [ylower, ylower+600]
fig, ax = plt.subplots(1, 5, figsize=(40, 10))
fig.suptitle('A2. Individual Channels & Composite Image (Zoomed In)', fontsize=40, fontweight='bold', y=0.95, x=0.30)
for i in range(4):
    ax[i].imshow(iF_crop[i,:,:], cmap='gray')
    ax[i].set_xlim(xlim)
    ax[i].set_ylim(ylim)
    ax[i].axis('off')
ax[4].imshow(composite_img)
ax[4].set_xlim(xlim)
ax[4].set_ylim(ylim)
fig, ax = plt.subplots(figsize=(15, 15))
fig.suptitle('A3. Composite Image (Zoomed In)', fontsize=20, fontweight='bold', y=0.95, x=0.25)
ax.imshow(composite_img)
ax.set_xlim(xlim)
ax.set_ylim(ylim)
#----------------------------------------------
ppf.get_memory_usage() ### monitor memory usage
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.0..1.6224868].
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.0..1.6224868].
Out[ ]:
'Memory usage: 2609.59 MB'
STOP FOR DISCUSSION/LECTURE
- Now we will complete the process by aligning the transcripts from 3 genes that are expected to have mutually exclusive spatial expression:
  - STMN1
  - PTPRC
  - ANXA1
- We will also overlay the Xenium-derived polygons on the image
In [ ]:
transcripts_subset_3g
Out[ ]:
| | transcript_id | cell_id | overlaps_nucleus | feature_name | x_location | y_location | z_location | qv | fov_name | nucleus_distance | codeword_index | group | binary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 281505041710735 | jgibpdce-1 | 0 | STMN1 | 0.195312 | 3196.330000 | 22.405909 | 40.000000 | AA17 | 0.851599 | 345 | gene_probes | assigned |
| 1 | 281505041605346 | UNASSIGNED | 0 | STMN1 | 0.773438 | 3971.490200 | 23.115683 | 23.350294 | AA17 | 1.518847 | 345 | gene_probes | unassigned |
| 2 | 282462819330108 | jpocnilp-1 | 0 | STMN1 | 790.425800 | 0.048828 | 21.617092 | 40.000000 | Z17 | 2.247070 | 345 | gene_probes | assigned |
| 3 | 282462819324347 | jghcnaik-1 | 0 | STMN1 | 1.914062 | 107.087890 | 22.697530 | 23.249954 | Z17 | 3.491614 | 345 | gene_probes | assigned |
| 4 | 282462819324349 | jgjmlafd-1 | 1 | STMN1 | 2.453125 | 842.318360 | 24.927584 | 34.959274 | Z17 | 0.000000 | 345 | gene_probes | assigned |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 34826 | 281513632282767 | jlikmhcm-1 | 0 | PTPRC | 4626.957000 | 3737.761700 | 24.899092 | 29.833567 | AA19 | 0.793769 | 203 | gene_probes | assigned |
| 34827 | 281513632283012 | jlkkdmkh-1 | 0 | PTPRC | 4652.390600 | 3617.697300 | 24.077530 | 40.000000 | AA19 | 0.294710 | 203 | gene_probes | assigned |
| 34828 | 281513632283171 | UNASSIGNED | 0 | PTPRC | 4668.508000 | 4177.179700 | 26.068360 | 40.000000 | AA19 | 6.217726 | 203 | gene_probes | unassigned |
| 34829 | 281513632283188 | jlkblene-1 | 1 | PTPRC | 4671.195300 | 3854.054700 | 24.882101 | 40.000000 | AA19 | 0.000000 | 203 | gene_probes | assigned |
| 34830 | 281513632283389 | jlkggjhh-1 | 0 | PTPRC | 4692.625000 | 4013.666000 | 24.045496 | 34.383205 | AA19 | 1.334653 | 203 | gene_probes | assigned |
34831 rows × 13 columns
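Before plotting, it can be useful to summarise how many transcripts of each gene were assigned to a cell. The sketch below uses a small made-up table with the `feature_name` and `binary` columns shown above and tabulates assigned versus unassigned counts per gene. The values are illustrative only.

```python
import pandas as pd

# Made-up rows using the column names and labels from the table above
toy_transcripts = pd.DataFrame({
    'feature_name': ['STMN1', 'STMN1', 'STMN1', 'PTPRC', 'PTPRC', 'ANXA1'],
    'binary': ['assigned', 'unassigned', 'assigned',
               'assigned', 'unassigned', 'assigned'],
})

# Rows: genes, columns: assigned / unassigned transcript counts
counts = pd.crosstab(toy_transcripts['feature_name'], toy_transcripts['binary'])

# Fraction of each gene's transcripts that fall inside a segmented cell
assigned_fraction = counts['assigned'] / counts.sum(axis=1)
print(counts)
print(assigned_fraction)
```

Applied to `transcripts_subset_3g`, a markedly lower assigned fraction for one gene could hint that the cells expressing it are being under-segmented, although other explanations are possible.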
In [ ]:
from matplotlib.patches import Polygon
### Step 1: Choose a region to zoom in
xlower = 0
ylower = 2000
xlim = [xlower, xlower+600]
ylim = [ylower, ylower+600]
### Step 2: Plot the composite fluorescence image
fig, ax = plt.subplots(figsize=(21, 21))
ax.imshow(composite_img)
ax.set_xlim(xlim)
ax.set_ylim(ylim)
### Step 3: Plot the polygons
grouped = cell_boundaries.groupby('cell_id')
for cell_id, group in grouped:
    # Repeat the first vertex at the end so the outline is explicitly closed
    group = pd.concat([group, group[:1]])
    plg = Polygon(group[['vertex_x', 'vertex_y']].values, edgecolor='r', facecolor='none')
    ax.add_patch(plg)
ax.set_xlim(xlim)
ax.set_ylim(ylim)
### Step 4: Plot the transcripts for STMN1, PTPRC and ANXA1
sns.scatterplot(data=transcripts_subset_3g, x='x_location', y='y_location', hue='feature_name', ax=ax, s=42)
ax.legend(loc='upper left', bbox_to_anchor=(1, 0.5), ncol=1, fontsize=10)
#----------------------------------------------
ppf.get_memory_usage() ### monitor memory usage
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). Got range [0.0..1.6224868].
Out[ ]:
'Memory usage: 1408.80 MB'
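The overlay above is a visual check of whether transcripts fall inside the polygons. The same question can be asked numerically with a point-in-polygon test. The sketch below uses `matplotlib.path.Path.contains_points` on one made-up rectangular cell and four made-up transcript positions. It is not how Xenium assigns transcripts to cells.

```python
import numpy as np
from matplotlib.path import Path

# Made-up cell outline (x, y vertices) and transcript coordinates
cell_vertices = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 3.0], [0.0, 3.0]])
transcript_xy = np.array([[1.0, 1.0], [3.5, 2.5], [5.0, 1.0], [2.0, 4.0]])

# contains_points treats the path as closed and returns one boolean per point
inside = Path(cell_vertices).contains_points(transcript_xy)
print(inside)
```

For the real data, the vertex array for one cell would come from `group[['vertex_x', 'vertex_y']].values` and the points from the `x_location` and `y_location` columns.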
DISCUSS/LECTURE
- What other image-based segmentation methods can you try?
- What factors do we need to take into account when choosing a segmentation model?
- Do you see errors? How do we evaluate them? (Covered in the next lecture.)
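As a starting point for thinking about evaluation, one simple heuristic (an assumption for illustration, not necessarily the approach taken in the next lecture) uses the fact that the three genes are expected to be mutually exclusive: a cell containing transcripts from more than one of them may reflect a segmentation error. The sketch below computes the fraction of such cells on a made-up table.

```python
import pandas as pd

# Made-up transcript-to-cell assignments
toy = pd.DataFrame({
    'cell_id': ['c1', 'c1', 'c2', 'c2', 'c3', 'UNASSIGNED'],
    'feature_name': ['STMN1', 'STMN1', 'STMN1', 'PTPRC', 'ANXA1', 'PTPRC'],
})

# Ignore transcripts that were not assigned to any cell
assigned = toy[toy['cell_id'] != 'UNASSIGNED']

# Number of distinct marker genes detected in each cell
genes_per_cell = assigned.groupby('cell_id')['feature_name'].nunique()

# Fraction of cells containing more than one supposedly exclusive gene
mixed_fraction = (genes_per_cell > 1).mean()
print(genes_per_cell)
print(mixed_fraction)
```

A mixed cell is not proof of an error, since neighbouring cells can overlap in the imaged plane, so this fraction is better read as a relative score for comparing segmentations than as an absolute error rate.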
END OF MODULE 4 : CLASSICAL SEGMENTATION
Thank you, and see you in the next module, where we will try a non-image-based method.